Source: www.redbubble.com Photo Credit: Nino Marcutti/Alamy Stock

1. Introduction

In spring 2020, New York City (NYC) became the epicenter of the global COVID-19 pandemic as its residents were forced to shelter in place and economic activity came to a grinding halt. The City's Department of Health and Mental Hygiene (DOHMH) tracks and publishes data on the daily and aggregate number of COVID cases in NYC on its GitHub repository. However, because DOHMH provides only raw data (in CSV format), it is difficult to digest the data and detect case trends in the city.

This work seeks to extract, transform and analyze the daily and aggregate cases reported by DOHMH. It uses visuals to depict trends in COVID-19 infections, hospitalizations and deaths across the city. It also examines case trends among boroughs, demographic groups, and neighborhoods to understand which groups are being impacted the most by the pandemic.

The analysis will be updated at the beginning of each week as new data become available, to allow for continuous monitoring of COVID-19 trends in NYC.

2. Data

NYC DOHMH publishes an open-source COVID-19 database on its GitHub repository. The database, which is updated daily, contains numerous tables that provide details about COVID cases, testing and vaccinations. This analysis uses three data sets from the repository, namely data-by-day, data-by-group and data-by-modzcta. Below are brief descriptions of each data set.

  • data-by-day: Provides a daily summary of all COVID cases, hospitalizations and deaths in the City as a whole, and by borough.

  • data-by-group: Provides a breakdown of the total number of cases, hospitalizations and deaths by different demographics, including borough, age, gender, and race.

  • data-by-modzcta: Gives a breakdown of aggregate cases by neighborhood and modified ZIP code. This data can be used to map COVID cases and deaths by neighborhood when combined with the MODZCTA shapefiles (which can be downloaded from DOHMH’s GitHub or the NYC Open Data Portal).

In addition to the three data sets highlighted above, the analysis also extracts and uses shapefile data from the City’s Open Data Portal to map COVID cases by neighborhood.

Now, let us extract and load the aforementioned data sets (from DOHMH GitHub page and NYC Open Data Portal) and get them ready for the analysis.
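The loading step could be sketched roughly as follows. This is a minimal sketch rather than the code used for this report: the repository address, the relative file paths and the helper name load_covid_tables are all assumptions and should be checked against the DOHMH GitHub page before use.

```r
# Hypothetical sketch: the base URL and file paths below are assumptions,
# not taken from this report.
base_url <- "https://raw.githubusercontent.com/nychealth/coronavirus-data/master"

# Assumed relative paths of the three CSV tables used in the analysis.
csv_paths <- c(
  daily   = "trends/data-by-day.csv",
  group   = "totals/by-group.csv",
  modzcta = "totals/data-by-modzcta.csv"
)

# Build the full URL for each table and keep the short names.
csv_urls <- setNames(file.path(base_url, csv_paths), names(csv_paths))

# Read every table into a named list of data frames (hypothetical helper).
load_covid_tables <- function(urls = csv_urls) {
  lapply(urls, read.csv, stringsAsFactors = FALSE)
}

# covid <- load_covid_tables()   # requires an internet connection
# head(covid$daily)
```

Keeping the paths in one named vector means a change in the repository layout only has to be fixed in one place.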

## [1] TRUE TRUE TRUE TRUE TRUE
## Reading layer `geo_export_65e02036-af81-4de3-b7b0-5e1f1e2e0a3e' from data source `/Users/aly_will_mac/Desktop/OLD PC/WILL/LEARNING/1. ALL PROJECTS/R-NYC-COVID-Stats/Shape Files/geo_export_65e02036-af81-4de3-b7b0-5e1f1e2e0a3e.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 178 features and 4 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -74.25559 ymin: 40.49612 xmax: -73.70001 ymax: 40.91553
## Geodetic CRS:  WGS84(DD)

The tables below show the first few rows of each data set.

Data-by-Day

Data-by-Group

Data-by-Modzcta

3. Data Examination

In this section, I examine the data sets to identify what needs to be cleaned.

3.1. Data Structure and Summary

The tables below depict the structure and summary of the three COVID data sets.

Daily Data

skim_type skim_variable n_missing complete_rate character.min character.max character.empty character.n_unique character.whitespace numeric.mean numeric.sd numeric.p0 numeric.p25 numeric.p50 numeric.p75 numeric.p100 numeric.hist
character date_of_interest 0 1 10 10 0 1060 0 NA NA NA NA NA NA NA NA
numeric CASE_COUNT 0 1 NA NA NA NA NA 2527.1245283 5078.2973956 0 617.75 1539.5 2863.00 54999 ▇▁▁▁▁
numeric PROBABLE_CASE_COUNT 0 1 NA NA NA NA NA 492.0377358 635.9121400 0 90.00 377.0 671.25 5882 ▇▁▁▁▁
numeric HOSPITALIZED_COUNT 0 1 NA NA NA NA NA 175.3084906 248.1974756 1 48.00 104.0 190.25 1842 ▇▁▁▁▁
numeric DEATH_COUNT 0 1 NA NA NA NA NA 35.9528302 77.1863059 0 7.00 13.0 32.00 598 ▇▁▁▁▁
numeric PROBABLE_DEATH_COUNT 0 1 NA NA NA NA NA 5.9783019 24.7973182 0 0.00 1.0 2.00 240 ▇▁▁▁▁
numeric CASE_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 2523.2584906 4662.9357200 0 630.25 1581.0 2865.25 39493 ▇▁▁▁▁
numeric ALL_CASE_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 3013.9584906 5216.2333481 0 778.75 2003.0 3622.50 43950 ▇▁▁▁▁
numeric HOSP_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 175.1009434 243.8682523 0 48.00 106.0 190.00 1663 ▇▁▁▁▁
numeric DEATH_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 35.9094340 76.2970520 0 8.00 12.0 32.00 566 ▇▁▁▁▁
numeric ALL_DEATH_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 41.8867925 99.7462144 0 8.00 13.0 34.25 775 ▇▁▁▁▁
numeric BX_CASE_COUNT 0 1 NA NA NA NA NA 417.1660377 948.3834150 0 80.75 211.5 443.50 10559 ▇▁▁▁▁
numeric BX_PROBABLE_CASE_COUNT 0 1 NA NA NA NA NA 96.9745283 145.9417346 0 13.00 65.0 133.00 1575 ▇▁▁▁▁
numeric BX_HOSPITALIZED_COUNT 0 1 NA NA NA NA NA 37.7707547 57.4190837 0 9.00 20.0 40.00 390 ▇▁▁▁▁
numeric BX_DEATH_COUNT 0 1 NA NA NA NA NA 6.7990566 16.1710112 0 1.00 2.0 5.00 132 ▇▁▁▁▁
numeric BX_PROBABLE_DEATH_COUNT 0 1 NA NA NA NA NA 1.1603774 5.1219994 0 0.00 0.0 0.00 46 ▇▁▁▁▁
numeric BX_CASE_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 416.4349057 851.9053169 0 81.75 236.0 461.25 7479 ▇▁▁▁▁
numeric BX_PROBABLE_CASE_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 96.6896226 134.1724500 0 14.00 73.0 138.25 1094 ▇▁▁▁▁
numeric BX_ALL_CASE_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 513.1311321 977.6351710 0 107.00 307.0 602.25 8573 ▇▁▁▁▁
numeric BX_HOSPITALIZED_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 37.7132075 55.9878310 0 9.00 21.0 39.00 358 ▇▁▁▁▁
numeric BX_DEATH_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 6.8094340 15.8448422 0 1.00 2.0 5.00 117 ▇▁▁▁▁
numeric BX_ALL_DEATH_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 7.9584906 20.6659700 0 1.00 2.0 5.00 158 ▇▁▁▁▁
numeric BK_CASE_COUNT 0 1 NA NA NA NA NA 761.7547170 1498.3876504 0 215.00 464.0 849.50 16664 ▇▁▁▁▁
numeric BK_PROBABLE_CASE_COUNT 0 1 NA NA NA NA NA 134.7622642 177.6639518 0 28.00 102.0 177.25 1906 ▇▁▁▁▁
numeric BK_HOSPITALIZED_COUNT 0 1 NA NA NA NA NA 53.1433962 72.9644525 0 16.00 31.0 56.25 555 ▇▁▁▁▁
numeric BK_DEATH_COUNT 0 1 NA NA NA NA NA 11.2283019 23.8219067 0 2.00 4.0 10.00 201 ▇▁▁▁▁
numeric BK_PROBABLE_DEATH_COUNT 0 1 NA NA NA NA NA 2.0226415 8.5817436 0 0.00 0.0 1.00 92 ▇▁▁▁▁
numeric BK_CASE_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 760.6339623 1382.4148767 0 226.50 481.0 858.00 11586 ▇▁▁▁▁
numeric BK_PROBABLE_CASE_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 134.4273585 164.9582641 0 27.75 107.0 175.25 1213 ▇▁▁▁▁
numeric BK_ALL_CASE_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 895.0792453 1536.0343325 0 259.00 593.5 1043.50 12786 ▇▁▁▁▁
numeric BK_HOSPITALIZED_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 53.0792453 71.3291016 0 17.00 32.0 53.00 490 ▇▁▁▁▁
numeric BK_DEATH_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 11.2245283 23.3750180 0 2.00 4.0 10.00 178 ▇▁▁▁▁
numeric BK_ALL_DEATH_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 13.2415094 31.3377557 0 3.00 4.0 11.00 252 ▇▁▁▁▁
numeric MN_CASE_COUNT 0 1 NA NA NA NA NA 463.4301887 920.8203633 0 105.00 287.0 495.00 9113 ▇▁▁▁▁
numeric MN_PROBABLE_CASE_COUNT 0 1 NA NA NA NA NA 90.9981132 114.9317078 0 18.00 70.5 125.25 972 ▇▁▁▁▁
numeric MN_HOSPITALIZED_COUNT 0 1 NA NA NA NA NA 26.5235849 36.3131175 0 7.00 16.0 31.00 275 ▇▁▁▁▁
numeric MN_DEATH_COUNT 0 1 NA NA NA NA NA 4.9179245 10.0653136 0 1.00 2.0 5.00 92 ▇▁▁▁▁
numeric MN_PROBABLE_DEATH_COUNT 0 1 NA NA NA NA NA 0.8198113 3.2486772 0 0.00 0.0 0.00 33 ▇▁▁▁▁
numeric MN_CASE_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 462.7660377 839.4616940 0 119.00 316.0 488.50 6394 ▇▁▁▁▁
numeric MN_PROBABLE_CASE_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 90.7924528 108.4387633 0 17.00 75.0 129.00 766 ▇▁▁▁▁
numeric MN_ALL_CASE_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 553.5547170 941.3468521 0 146.50 380.0 604.00 7160 ▇▁▁▁▁
numeric MN_HOSPITALIZED_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 26.4820755 35.3165459 0 6.00 17.0 31.00 228 ▇▁▁▁▁
numeric MN_DEATH_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 4.9094340 9.7595892 0 1.00 2.0 4.00 73 ▇▁▁▁▁
numeric MN_ALL_DEATH_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 5.7188679 12.7291173 0 1.00 2.0 5.00 100 ▇▁▁▁▁
numeric QN_CASE_COUNT 0 1 NA NA NA NA NA 705.7452830 1430.4416288 0 146.00 407.0 802.25 15221 ▇▁▁▁▁
numeric QN_PROBABLE_CASE_COUNT 0 1 NA NA NA NA NA 136.2103774 171.6479589 0 21.00 100.5 195.00 1609 ▇▁▁▁▁
numeric QN_HOSPITALIZED_COUNT 0 1 NA NA NA NA NA 49.2679245 76.7668675 0 13.00 27.0 52.00 609 ▇▁▁▁▁
numeric QN_DEATH_COUNT 0 1 NA NA NA NA NA 10.7547170 24.3999997 0 2.00 4.0 9.00 202 ▇▁▁▁▁
numeric QN_PROBABLE_DEATH_COUNT 0 1 NA NA NA NA NA 1.7169811 7.3487965 0 0.00 0.0 1.00 68 ▇▁▁▁▁
numeric QN_CASE_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 704.6490566 1322.4917847 0 149.75 436.5 828.00 11550 ▇▁▁▁▁
numeric QN_PROBABLE_CASE_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 135.8471698 161.0818396 0 21.00 104.0 198.25 1220 ▇▁▁▁▁
numeric QN_ALL_CASE_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 840.4971698 1469.4601416 0 184.00 545.5 1052.75 12687 ▇▁▁▁▁
numeric QN_HOSPITALIZED_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 49.2264151 75.3578411 0 13.00 28.0 52.00 562 ▇▁▁▁▁
numeric QN_DEATH_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 10.7396226 24.0074771 0 2.00 4.0 9.00 177 ▇▁▁▁▁
numeric QN_ALL_DEATH_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 12.4603774 30.9072959 0 2.00 4.0 10.00 240 ▇▁▁▁▁
numeric SI_CASE_COUNT 0 1 NA NA NA NA NA 178.1754717 332.4948406 0 42.00 111.0 197.25 3720 ▇▁▁▁▁
numeric SI_PROBABLE_CASE_COUNT 0 1 NA NA NA NA NA 33.0283019 36.5226352 0 5.75 26.0 49.00 316 ▇▁▁▁▁
numeric SI_HOSPITALIZED_COUNT 0 1 NA NA NA NA NA 10.7018868 11.8029860 0 3.00 7.0 14.00 83 ▇▂▁▁▁
numeric SI_DEATH_COUNT 0 1 NA NA NA NA NA 2.2509434 3.8622711 0 0.00 1.0 3.00 34 ▇▁▁▁▁
numeric SI_PROBABLE_DEATH_COUNT 0 1 NA NA NA NA NA 0.2584906 0.9819781 0 0.00 0.0 0.00 9 ▇▁▁▁▁
numeric SI_CASE_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 177.9216981 305.2646687 0 42.00 118.0 199.25 2686 ▇▁▁▁▁
numeric SI_PROBABLE_CASE_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 32.8792453 33.9528160 0 6.00 27.0 50.00 233 ▇▃▁▁▁
numeric SI_ALL_CASE_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 210.7924528 334.0528820 0 48.75 149.0 250.00 2906 ▇▁▁▁▁
numeric SI_HOSPITALIZED_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 10.6811321 11.2766827 0 3.00 8.0 13.00 72 ▇▂▁▁▁
numeric SI_DEATH_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 2.2301887 3.5835942 0 1.00 1.0 2.00 26 ▇▁▁▁▁
numeric SI_ALL_DEATH_COUNT_7DAY_AVG 0 1 NA NA NA NA NA 2.4990566 4.3994376 0 1.00 1.0 3.00 34 ▇▁▁▁▁
numeric INCOMPLETE 0 1 NA NA NA NA NA 402.6320755 4940.5862792 0 0.00 0.0 0.00 60970 ▇▁▁▁▁

Group Data

skim_type skim_variable n_missing complete_rate character.min character.max character.empty character.n_unique character.whitespace numeric.mean numeric.sd numeric.p0 numeric.p25 numeric.p50 numeric.p75 numeric.p100 numeric.hist
character group 0 1.0000000 3 9 0 6 0 NA NA NA NA NA NA NA NA
character subgroup 0 1.0000000 0 22 1 27 0 NA NA NA NA NA NA NA NA
numeric CONFIRMED_CASE_RATE 1 0.9629630 NA NA NA NA NA 30503.3977 4679.7023 19004.50 27864.062 30666.865 33052.327 39665.81 ▁▃▇▅▃
numeric CASE_RATE 1 0.9629630 NA NA NA NA NA 36444.4442 5691.9720 22559.28 33003.207 36720.275 39326.058 47018.65 ▁▃▇▅▃
numeric HOSPITALIZED_RATE 1 0.9629630 NA NA NA NA NA 2437.4304 1997.9331 241.72 1439.553 2357.615 2585.832 10635.32 ▇▇▁▁▁
numeric DEATH_RATE 3 0.8888889 NA NA NA NA NA 658.6129 824.7574 2.70 358.265 542.565 607.050 4235.86 ▇▁▁▁▁
numeric CONFIRMED_CASE_COUNT 1 0.9629630 NA NA NA NA NA 592645.0000 545165.0400 99530.00 272624.750 435353.000 702846.250 2678752.00 ▇▃▁▁▁
numeric PROBABLE_CASE_COUNT 1 0.9629630 NA NA NA NA NA 115452.8462 106379.3686 18617.00 56596.500 88915.000 141340.750 521560.00 ▇▂▁▁▁
numeric CASE_COUNT 1 0.9629630 NA NA NA NA NA 708097.8462 651324.5863 118147.00 329874.250 526533.000 847702.250 3200312.00 ▇▃▁▁▁
numeric HOSPITALIZED_COUNT 1 0.9629630 NA NA NA NA NA 46099.4231 42720.9188 1665.00 17391.750 37600.500 58816.500 203792.00 ▇▅▁▁▁
numeric DEATH_COUNT 3 0.8888889 NA NA NA NA NA 12205.3750 10682.4942 46.00 5244.500 9365.500 19642.250 44585.00 ▇▂▅▁▁

Modzcta Data

skim_type skim_variable n_missing complete_rate character.min character.max character.empty character.n_unique character.whitespace numeric.mean numeric.sd numeric.p0 numeric.p25 numeric.p50 numeric.p75 numeric.p100 numeric.hist
character NEIGHBORHOOD_NAME 0 1 7 59 0 162 0 NA NA NA NA NA NA NA NA
character BOROUGH_GROUP 0 1 5 13 0 5 0 NA NA NA NA NA NA NA NA
character label 0 1 5 12 0 177 0 NA NA NA NA NA NA NA NA
numeric MODIFIED_ZCTA 0 1 NA NA NA NA NA 10810.37853 5.781733e+02 10001.00000 10301.00000 11109.00000 11361.00000 11697.00000 ▇▃▁▇▇
numeric lat 0 1 NA NA NA NA NA 40.72555 8.364830e-02 40.50777 40.67082 40.72644 40.77643 40.89951 ▁▅▇▇▃
numeric lon 0 1 NA NA NA NA NA -73.91881 9.965940e-02 -74.24227 -73.97870 -73.92405 -73.84698 -73.71091 ▁▁▇▆▃
numeric COVID_CONFIRMED_CASE_COUNT 0 1 NA NA NA NA NA 14516.59322 8.186873e+03 1038.00000 8408.00000 13499.00000 20404.00000 33988.00000 ▆▇▆▅▂
numeric COVID_PROBABLE_CASE_COUNT 0 1 NA NA NA NA NA 2861.22034 1.539180e+03 207.00000 1785.00000 2616.00000 3981.00000 7082.00000 ▅▇▅▃▁
numeric COVID_CASE_COUNT 0 1 NA NA NA NA NA 17377.81356 9.605003e+03 1292.00000 10259.00000 15948.00000 24337.00000 39937.00000 ▆▇▆▅▂
numeric COVID_CONFIRMED_CASE_RATE 0 1 NA NA NA NA NA 30934.32780 4.638634e+03 19013.04000 27863.59000 30245.67000 33423.36000 48296.47000 ▁▇▆▁▁
numeric COVID_CASE_RATE 0 1 NA NA NA NA NA 37235.82288 5.170696e+03 24264.42000 33774.31000 36345.32000 39695.18000 58541.63000 ▁▇▃▁▁
numeric POP_DENOMINATOR 0 1 NA NA NA NA NA 47100.66136 2.615157e+04 2972.12000 27180.77000 42737.28000 66856.31000 110369.78000 ▅▇▅▃▂
numeric COVID_CONFIRMED_DEATH_COUNT 0 1 NA NA NA NA NA 211.45763 1.524139e+02 0.00000 91.00000 170.00000 314.00000 781.00000 ▇▅▃▁▁
numeric COVID_PROBABLE_DEATH_COUNT 0 1 NA NA NA NA NA 35.23729 2.660870e+01 0.00000 15.00000 28.00000 48.00000 109.00000 ▇▆▃▂▁
numeric COVID_DEATH_COUNT 0 1 NA NA NA NA NA 246.69492 1.770593e+02 1.00000 110.00000 201.00000 364.00000 884.00000 ▇▅▃▂▁
numeric COVID_CONFIRMED_DEATH_RATE 0 1 NA NA NA NA NA 425.20243 1.885513e+02 0.00000 330.23000 422.10000 521.66000 1305.45000 ▃▇▃▁▁
numeric COVID_DEATH_RATE 0 1 NA NA NA NA NA 496.28921 2.200567e+02 11.42000 380.20000 489.32000 618.73000 1528.15000 ▃▇▃▁▁
numeric PERCENT_POSITIVE 0 1 NA NA NA NA NA 24.86322 4.654500e+00 7.89000 22.61000 25.40000 27.18000 36.33000 ▁▁▆▇▁
numeric TOTAL_COVID_TESTS 0 1 NA NA NA NA NA 53736.57062 2.932419e+04 4242.00000 30822.00000 48733.00000 75421.00000 129062.00000 ▆▇▆▃▂

From the tables above, the only issue that needs to be addressed is the missing values in the group data.

3.2. Missing Data

From the data summary tables, only the group data has missing values. Let us check again to make sure.

Daily Data

Group Data

Modzcta Data

As depicted by the charts, about four percent of the observations in the group data are missing. After further review, it is clear that all the missing observations come from the Age group category in the group column. In Section 4.1, I re-code the age groups in the subgroup column.
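The four percent figure is consistent with the summary table in Section 3.1: seven columns have one missing value each and two columns have three, which gives 13 missing cells out of 27 rows × 11 columns = 297, or roughly 4.4 percent. A minimal base-R sketch of the same check is shown below; the toy data frame group_toy and its values are illustrative assumptions, not the DOHMH data.

```r
# Toy stand-in for the group table; values are illustrative only.
group_toy <- data.frame(
  subgroup    = c("0-17", "0-4", "5-12", "13-17"),
  CASE_COUNT  = c(NA, 10, 20, 30),
  DEATH_COUNT = c(5, NA, NA, NA),
  stringsAsFactors = FALSE
)

# Number of missing values in each column.
missing_per_column <- colSums(is.na(group_toy))

# Share of all cells that are missing, in percent.
pct_missing <- 100 * sum(is.na(group_toy)) / (nrow(group_toy) * ncol(group_toy))

# Subgroups whose rows contain at least one missing value.
rows_with_na <- group_toy$subgroup[!complete.cases(group_toy)]
```

Listing the affected rows, as in rows_with_na, is what reveals that the gaps are confined to the age categories.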

4. Data Wrangling

To get the data sets ready for the analysis, I clean and manipulate them through the following actions.

  • group data: I combined some of the age categories into one to remove the missing values. I also corrected the borough name from StatenIsland to Staten Island.

  • daily data: I changed the data type of the date_of_interest variable from character to date.

4.1. Consolidate ‘0-17’ age group

Under the Age group category, the 0-17 group has three sub-groupings (0-4, 5-12, 13-17). However, the DEATH_RATE and DEATH_COUNT statistics are provided only for the main 0-17 age group, while all the other COVID statistics are provided only for the age sub-categories and not for the main 0-17 category. This creates missing values in the rows containing these age categories, as shown below.

To handle the missing data, I use the rollsumr function in R to aggregate statistics for the three sub-categories (0-4, 5-12, 13-17) under the main category (0-17). The sub-categories are subsequently deleted from the table.
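The idea behind this consolidation can be illustrated with a minimal base-R sketch. It deliberately uses a plain sum instead of rollsumr, and the toy data frame age_toy, its values and the 13-17 label are assumptions made for illustration. Only counts are summed here; per-100,000 rates are not additive across groups with different populations and would need separate treatment.

```r
# Toy stand-in for the Age rows of the group table; values are illustrative only.
age_toy <- data.frame(
  subgroup    = c("0-17", "0-4", "5-12", "13-17", "18-24"),
  CASE_COUNT  = c(NA, 10, 20, 30, 40),
  DEATH_COUNT = c(5, NA, NA, NA, 2),
  stringsAsFactors = FALSE
)

sub_labels <- c("0-4", "5-12", "13-17")
is_sub  <- age_toy$subgroup %in% sub_labels
is_main <- age_toy$subgroup == "0-17"

# Fill each missing count in the 0-17 row with the sum over the sub-categories.
count_cols <- c("CASE_COUNT")
for (col in count_cols) {
  if (is.na(age_toy[is_main, col])) {
    age_toy[is_main, col] <- sum(age_toy[is_sub, col])
  }
}

# Drop the sub-category rows now that the 0-17 row is complete.
age_clean <- age_toy[!is_sub, ]
```

In this toy example the 0-17 row ends up with a CASE_COUNT of 60 and keeps its original DEATH_COUNT, so no missing values remain.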

Now, check to see if the re-coding took care of the missing values in the group data.

4.2. Clean the Staten Island subgroup

In the group table, ‘Staten Island’ is written as StatenIsland as shown in the table below.

Borough Count
Bronx 1
Brooklyn 1
Manhattan 1
Queens 1
StatenIsland 1

I clean it by adding a white space to correct the name of the borough, as shown below.

Borough Count
Bronx 1
Brooklyn 1
Manhattan 1
Queens 1
Staten Island 1
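A minimal sketch of this correction in base R follows. The vector boroughs is a stand-in for the borough labels in the subgroup column, not the actual table.

```r
# Stand-in for the borough labels found in the group table.
boroughs <- c("Bronx", "Brooklyn", "Manhattan", "Queens", "StatenIsland")

# Replace the exact label only, leaving every other value untouched.
boroughs_clean <- sub("^StatenIsland$", "Staten Island", boroughs)

# Tabulate the cleaned labels to confirm the result.
table(boroughs_clean)
```

Anchoring the pattern with ^ and $ avoids accidentally altering any longer label that happens to contain the same text.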

4.3. Change data types for date_of_interest

In the daily data, the date_of_interest column is stored as a string variable. I change it to a date variable.

## [1] "Date"
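The conversion can be sketched as follows. The month/day/year format string is an assumption (the summary table only shows that every date is a 10-character string), and daily_toy is a toy stand-in for the daily table.

```r
# Toy stand-in for the daily table; the date format is an assumption.
daily_toy <- data.frame(
  date_of_interest = c("02/29/2020", "01/23/2023"),
  stringsAsFactors = FALSE
)

# Parse the character column into Date objects.
daily_toy$date_of_interest <- as.Date(daily_toy$date_of_interest,
                                      format = "%m/%d/%Y")

# Confirm the new type.
class(daily_toy$date_of_interest)
```

If the format string does not match the data, as.Date returns NA for those rows, so checking for missing values after the conversion is a useful safeguard.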

5. Analyzing Citywide Impact

This section analyzes the daily and total number of COVID cases in the City as a whole as of 2023-01-23.

5.1. Citywide: Total Cases

As of 2023-01-23, approximately 3.2 million COVID infections have been recorded in NYC, with close to 204,000 of those infections leading to hospitalization. Exactly 44,585 people have lost their lives to COVID-19 in the City.

Total Infections Total Hospitalizations Total Deaths
3,200,312 203,792 44,585

The charts below show the trends in daily Citywide cases since the beginning of the pandemic.

Infections

Hospitalizations

Deaths

The charts above show that infections in NYC peaked at the beginning of 2022, during the Omicron wave. While there have been three waves of hospitalizations and deaths, most of the hospitalizations and deaths occurred during the initial wave of infections (between March and April of 2020). The availability of vaccines during the Omicron wave appears to have helped reduce the number of hospitalizations and deaths around that time.

5.2. Citywide: New Cases

The table below shows the number of new infections, hospitalizations and deaths recorded on 2023-01-23, the latest date for which we have records.

Date Infections Hospitalizations Deaths
2023-01-23 1,759 10 6

6. Analyzing COVID Impact by Borough

This section disaggregates the daily and total number of COVID cases among the five NYC boroughs.

6.1. Total Cases by Borough

The chart below shows the total number of COVID cases by borough. It gives the raw numbers of infections, hospitalizations and deaths since the beginning of the pandemic. Because we are looking at raw numbers (and not numbers adjusted for population), the more populous boroughs will show more infections, hospitalizations and deaths.

6.2. Daily Average Cases by Borough

The charts below show the trends in the daily average infections, hospitalizations and deaths per borough.

Average Infections by Borough

Average Hospitalizations by Borough

Average Deaths by Borough

The charts above show that daily infections, hospitalizations and deaths have consistently been highest in Brooklyn and Queens.

6.3. Share of Infections that Turned into Hospitalizations and Deaths

The chart below shows total hospitalizations and deaths as percent of total infections for each borough.

The chart above indicates that, even though Brooklyn has had the highest number of COVID cases (see Section 6.1), the Bronx has seen the largest share of its cases lead to hospitalizations and deaths. This may be because, although less populated than Brooklyn, the Bronx has many more people (per capita) living with underlying medical conditions that exacerbate the effects of COVID. For example, the Bronx is known to have one of the highest asthma hospitalization rates in New York State.1
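The share calculation itself is a simple ratio. As a hedged illustration, the sketch below applies it to the citywide totals reported in Section 5.1 rather than to the borough figures; the helper name share_pct is hypothetical.

```r
# Hypothetical helper: events as a percentage of infections.
share_pct <- function(events, infections) 100 * events / infections

# Citywide totals as of 2023-01-23 (from Section 5.1).
citywide <- c(infections = 3200312, hospitalizations = 203792, deaths = 44585)

hosp_share  <- share_pct(citywide[["hospitalizations"]], citywide[["infections"]])
death_share <- share_pct(citywide[["deaths"]], citywide[["infections"]])

# Roughly 6.4 percent of infections led to hospitalization and 1.4 percent to death.
round(c(hospitalized = hosp_share, died = death_share), 2)
```

The same function applied to each borough's totals yields the shares plotted in this section.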

6.4. Which Boroughs Have Been Hit the Hardest?

Section 6.1 shows that Brooklyn has the highest number of cases, hospitalizations and deaths among all boroughs. This makes sense, since Brooklyn is the most populous of the five boroughs. However, to compare boroughs and determine which one has been most severely affected, we have to adjust for population. Hence, we use the rate statistics (per 100,000 people).

Below are the infection, hospitalization and death rates (per 100,000) for each borough.

The chart above indicates that, after adjusting for population, Staten Island - the least populated borough - has the highest rate of infections. The Bronx, on the other hand, has the highest rates of hospitalizations and deaths.
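To see why the population adjustment can reverse a ranking, consider the minimal sketch below. The two boroughs, their case counts and their populations are entirely hypothetical, and rate_per_100k is an illustrative helper rather than a function used in this analysis.

```r
# Hypothetical helper: events per 100,000 residents.
rate_per_100k <- function(count, population) count / population * 1e5

# Two made-up boroughs: A has more cases, B has far fewer residents.
toy <- data.frame(
  borough    = c("A", "B"),
  cases      = c(50000, 30000),
  population = c(2500000, 500000),
  stringsAsFactors = FALSE
)

toy$case_rate <- rate_per_100k(toy$cases, toy$population)

# Sort from the highest to the lowest rate.
toy[order(-toy$case_rate), ]
```

Borough A leads on raw cases, but B has the higher rate (6,000 versus 2,000 per 100,000), which mirrors how a small borough can rank first once population is taken into account.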

7. Analyzing by Age Group

This section details how COVID-19 has impacted NYC residents of different age groups. The data set breaks down age into eight categories: 0-17, 18-24, 25-34, 35-44, 45-54, 55-64, 65-74, and 75+.

7.1. Case, Hospitalization and Death Rates

The first tab shows the infection, hospitalization and death rates (per 100,000) for the various age groups. The second tab shows hospitalization and death rates as a share of case rates.

Case, Hospitalization and Death Rates

Share of Cases that Lead to Hospitalization or Death

The two tables indicate that, while young people (under 45 years) are infected at higher rates than any other age group, only a small share are hospitalized and barely any die from the virus. On the other hand, seniors, especially those 75 years and over, tend to be hospitalized and die at the highest rates even though they have the lowest infection rates. This is consistent with reports that COVID is much more deadly among seniors.

8. Analysis by Race/Ethnicity

This section details how COVID has affected people of different racial and ethnic backgrounds. The data set breaks race/ethnicity into four categories: Asian/Pacific-Islander, Black/African-American, Hispanic/Latino and White.

Case, Hospitalization and Death Rates

The first tab shows the infection, hospitalization and death rates (per 100,000) for each race/ethnicity.

The second tab shows hospitalization and death rates as a share of case rates.

Case, Hospitalization and Death Rates

Share of Cases that Cause Hospitalization or Death

The two charts indicate that, while African-Americans have one of the lowest infection rates, they tend to be hospitalized or die from the virus at the highest rates.

9. Map: COVID-19 Cases by Neighborhood

In this section, I use choropleth maps to visualize and compare infection and death rates (per 100K) among NYC neighborhoods.

To create the maps, I merge the modzcta data frame (which disaggregates total COVID cases by ZIP code and neighborhood) with the modzcta shapefile.
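The join can be sketched with plain data frames as below. The toy tables stand in for the shapefile attribute table and the modzcta data, and the shapefile key name modzcta and the geometry_id column are assumptions. A left join is used so that every polygon is kept, which matters here because the shapefile has 178 features while the case table has 177 distinct labels.

```r
# Toy stand-in for the shapefile attribute table (column names are assumptions).
shapes_toy <- data.frame(
  modzcta     = c("10001", "10002", "10003"),
  geometry_id = 1:3,
  stringsAsFactors = FALSE
)

# Toy stand-in for the modzcta case table; values are illustrative only.
cases_toy <- data.frame(
  MODIFIED_ZCTA   = c(10001, 10002),
  COVID_CASE_RATE = c(30000, 35000)
)

# Harmonize the key type before joining (numeric code to character).
cases_toy$MODIFIED_ZCTA <- as.character(cases_toy$MODIFIED_ZCTA)

# Left join: keep every polygon, even one with no matching case record.
map_data <- merge(shapes_toy, cases_toy,
                  by.x = "modzcta", by.y = "MODIFIED_ZCTA",
                  all.x = TRUE)
```

Polygons without a match carry NA in the rate column and can be drawn in a neutral color on the choropleth.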

Infection Rate (per 100K) by Neighborhood

Death Rate (Per 100K) by Neighborhood

10. Conclusion

The following are the trends observed in reported COVID cases in NYC as of 2023-01-23.

  • Infections peaked in January 2022, during the Omicron wave.

  • However, hospitalizations and deaths reached their peaks during the first wave of the pandemic (April 2020). Because of the availability of vaccines, the Omicron wave did not cause as much hospitalization and was not as deadly as the 2020 wave of infections.

  • Because of the size of its population, Brooklyn has seen the highest number of infections, hospitalizations and deaths since the beginning of the pandemic compared to the other boroughs.

    • However, after adjusting for population, Staten Island has the highest rate of infection (per 100K people), while the Bronx has the highest rates of hospitalization and death.
    • The Bronx has seen the largest share of all cases lead to hospitalization (8 percent) and death (1.6 percent).
  • Brooklyn and Queens have consistently averaged the highest daily number of infections, hospitalizations and deaths since the beginning of the pandemic.

  • In terms of age, young people under 45 years have the highest rate of infection. Yet, seniors over 65 years tend to be hospitalized and die at the highest rates.

    • Those 75 years and over have seen the largest share of their cases lead to hospitalization (34.5 percent) and death (13.8 percent).
  • Even though African-Americans have one of the lowest infection rates, they tend to be hospitalized and die at higher rates compared with other races/ethnicities.

    • African-Americans have seen the largest share of their cases lead to hospitalization (9.6 percent) and death (3.3 percent).2